Random action of compact Lie groups and minimax estimation of a mean pattern



Jeremie Bigot, Claire Christophe and Sebastien Gadat 

Institut de Mathematiques de Toulouse 
Universite de Toulouse et CNRS (UMR 5219) 
31062 Toulouse, Cedex 9, France 

{Jeremie.Bigot, Claire.Christophe, Sebastien.Gadat}@math.univ-toulouse.fr

October 18, 2011 



Abstract 

This paper considers the problem of estimating a mean pattern in the setting of Grenander's
pattern theory. Shape variability in a data set of curves or images is modeled by the
random action of elements in a compact Lie group on an infinite dimensional space. In the
case of observations contaminated by an additive Gaussian white noise, it is shown that
estimating a reference template in the setting of Grenander's pattern theory falls into the
category of deconvolution problems over Lie groups. To obtain this result, we build an
estimator of a mean pattern by using Fourier deconvolution and harmonic analysis on compact
Lie groups. In an asymptotic setting where the number of observed curves or images tends
to infinity, we derive upper and lower bounds for the minimax quadratic risk over Sobolev
balls. This rate depends on the smoothness of the density of the random Lie group elements
representing shape variability in the data, which makes a connection between estimating a
mean pattern and standard deconvolution problems in nonparametric statistics.

1 Introduction



In signal and image processing, data are often in the form of a set of $n$ curves or images
$Y_1, \ldots, Y_n$. In many applications, observed curves or images have a similar structure which may
lead to the assumption that these observations are random elements which vary around the
same mean pattern (also called reference template). However, due to additive noise and shape
variability in the data, this mean pattern is typically unknown and has to be estimated. In
this setting, a widely used approach is Grenander's pattern theory [Gre93, GM07] which models
shape variability by the action of a Lie group on an infinite dimensional space of curves or
images. In the last decade, the study of transformation Lie groups to model shape variability of
images has been an active research field, and we refer to [TY05, TY11] for a recent overview of
the theory of deformable templates. Currently, there is also a growing interest in statistics on the
problem of estimating the mean pattern of a set of curves or images using deformable templates
[AAT07, AKT10, BG10, BGL09, BGV09, MMTY08]. In this paper, we focus on the problem
of constructing asymptotically minimax estimators of a mean pattern using noncommutative
Lie groups to model shape variability. The main goal of this paper is to show that estimating
a reference template in the setting of Grenander's pattern theory falls into the category of
deconvolution problems over Lie groups as formulated in [KK08].

To be more precise, let $G$ be a connected and compact Lie group. Let $L^2(G)$ be the Hilbert
space of complex valued, square integrable functions on the group $G$ with respect to the Haar
measure $dg$. We propose to study the nonparametric estimation of a complex valued function
$f^* : G \to \mathbb{C}$ in the following deformable white noise model

$$dY_m(g) = f_m(g)\,dg + \epsilon\,dW_m(g), \quad g \in G,\ m \in [\![1,n]\!], \tag{1.1}$$

where

$$f_m(g) = f^*(\tau_m^{-1} g).$$

The $\tau_m$'s are independent and identically distributed (i.i.d.) random variables belonging to $G$ and
the $W_m$'s are independent standard Brownian sheets on the topological space $G$ with reference
measure $dg$. For all $m = 1, \ldots, n$, $\tau_m$ is also supposed to be independent of $W_m$.

In (1.1), the function $f^*$ is the unknown mean pattern to estimate in the asymptotic setting
$n \to +\infty$, and $L^2(G)$ represents an infinite dimensional space of curves or images. The $\tau_m$'s are
random variables acting on $L^2(G)$ and they model shape variability in the data. The $W_m$'s model
intensity variability in the observed curves or images. In what follows, the random variables $\tau_m$
are also supposed to have a known density $h \in L^2(G)$. We will show that $h$ plays the role of the
kernel of a convolution operator that has to be inverted to construct an optimal (in the minimax
sense) estimator of $f^*$. Indeed, since $W_m$ has zero expectation, it follows that the expectation
of the $m$-th observation in (1.1) is equal to

$$\mathbb{E} f_m(g) = \int_G f^*(\tau^{-1} g)\,h(\tau)\,d\tau \quad \text{for any } m \in [\![1,n]\!].$$

Therefore, $\mathbb{E} f_m = f^* * h$ is the convolution over the group $G$ between the function $f^*$ and
the density $h$. Hence, we propose to build an estimator of $f^*$ using a regularized deconvolution
method over Lie groups. This class of inverse problems is based on the use of harmonic
analysis and Fourier analysis on compact Lie groups to transform a convolution into a product of
Fourier coefficients. However, unlike standard Fourier deconvolution on the torus, when $G$ is
not a commutative group, the Fourier coefficients of a function in $L^2(G)$ are no longer complex
coefficients but matrices that grow in dimension with increasing "frequency". This somewhat complicates
both the inversion process and the study of the asymptotic minimax properties of the resulting
estimators.
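
To make the "convolution becomes a product" statement concrete in the simplest commutative case, the following Python sketch checks it numerically. It assumes $G = \mathbb{R}/\mathbb{Z}$ discretized on a uniform grid; the two test functions, the grid size and the helper names are illustrative choices that do not come from the paper.

```python
import cmath
import math

def fourier_coefficient(values, ell):
    """Riemann-sum approximation of c_l(f) = int_0^1 f(g) exp(-2 i pi l g) dg."""
    n = len(values)
    return sum(v * cmath.exp(-2j * math.pi * ell * k / n)
               for k, v in enumerate(values)) / n

def circular_convolution(f, h):
    """Discrete analogue of (f * h)(g) = int f(g - tau) h(tau) dtau on the circle."""
    n = len(f)
    return [sum(f[(j - k) % n] * h[k] for k in range(n)) / n for j in range(n)]

N = 64
grid = [k / N for k in range(N)]
# Hypothetical pattern and shift density (positive, integrates to 1 on [0, 1)).
f_star = [math.cos(2 * math.pi * g) + 0.5 * math.sin(6 * math.pi * g) for g in grid]
h = [1.0 + 0.8 * math.cos(2 * math.pi * g) for g in grid]

conv = circular_convolution(f_star, h)
# Largest gap between c_l(f * h) and c_l(f) c_l(h) over a few frequencies.
max_gap = max(
    abs(fourier_coefficient(conv, ell)
        - fourier_coefficient(f_star, ell) * fourier_coefficient(h, ell))
    for ell in range(-5, 6)
)
```

On the grid the identity holds exactly, so `max_gap` is at rounding level. In the noncommutative case the scalar product of coefficients is replaced by a product of matrices, whose order matters.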

In [BLV10], a model similar to (1.1) has been studied where $n$ is held fixed, and the $\tau_m$'s are
not random but deterministic parameters to be estimated in the asymptotic setting $\epsilon \to 0$ using
semi-parametric statistics techniques. The potential of using noncommutative harmonic analysis
for various applications in engineering is well described in [CK01]. The contribution of this paper
is thus part of the growing interest in nonparametric statistics and inverse problems on the use
of harmonic analysis on Lie groups [Kim98, KK02, KK08, KR01, LKKK11, PMRC10, Yaz04].

Our construction of an estimator of the mean pattern in (1.1) is inspired by the following
problem of stochastic deconvolution over Lie groups introduced in [KK08]: estimate $f^* \in L^2(G)$
from the regression model

$$Y_j = \int_G f^*(\tau^{-1} g_j)\,h(\tau)\,d\tau + \eta_j, \quad g_j \in G,\ j \in [\![1,n]\!], \tag{1.2}$$

where $h$ is a known convolution kernel, the $g_j$'s are "design points" in $G$, and the $\eta_j$'s are
independent realizations of a random noise process with zero mean and finite variance. In
[KK08], a notion of asymptotic minimaxity over $L^2(G)$ is introduced, and the authors derive
upper and lower bounds for a minimax risk over Sobolev balls. In this paper we also introduce
a notion of minimax risk in model (1.1). However, deriving upper and lower bounds of the
minimax risk for the estimation of $f^*$ is significantly more difficult in (1.1) than in model (1.2).
This is due to the fact that there are two sources of noise in model (1.1): a source of additive
Gaussian noise $W_m$ which is a classical one for studying minimax properties of an estimator, and
a source of shape variability due to the $\tau_m$'s which is much more difficult to treat. In particular,
standard methods to derive lower bounds of the minimax risk in classical white noise models
such as Fano's Lemma are not adapted to the source of shape variability in (1.1). We show that
one may use Assouad's cube technique (see e.g. [Tsy09] and references therein), but it has
to be carefully adapted to model (1.1).

The paper is organized as follows. In Section 2, we describe the construction of our estimator
using a deconvolution step and Fourier analysis on compact Lie groups. We also define a notion
of asymptotic optimality in the minimax sense for estimators of the mean pattern. In Section 3,
we derive an upper bound on the minimax risk that depends on smoothness assumptions on the
density $h$. A lower bound on the minimax risk is also given. All proofs are gathered in a technical
appendix. At the end of the paper, we have also included some technical material about Fourier
analysis on compact Lie groups, along with some formulas for the rate of convergence of the
eigenvalues of the Laplace-Beltrami operator which are needed to derive our asymptotic rates
of convergence.

2 Mean pattern estimation via deconvolution on Lie groups 

In this section, we use various concepts from harmonic analysis on Lie groups which are defined
in Appendix B.

2.1 Sobolev space in $L^2(G)$

Let $\hat{G}$ be the set of equivalence classes of irreducible representations of $G$ that is identified to
the set of unitary representations of each class. For $\pi \in \hat{G}$ and $g \in G$ one has that $\pi(g) \in
GL_{d_\pi \times d_\pi}(\mathbb{C})$ (the set of $d_\pi \times d_\pi$ nonsingular matrices with complex entries) where $d_\pi$ is the
dimension of $\pi$. By the Peter-Weyl theorem (see Appendix B.2), any function $f \in L^2(G)$ can
be decomposed as

$$f(g) = \sum_{\pi \in \hat{G}} d_\pi \,\mathrm{Tr}\left(\pi(g)\, c_\pi(f)\right), \tag{2.1}$$

where $\mathrm{Tr}$ is the trace operator and $c_\pi(f) = \int_G f(g)\,\pi(g^{-1})\,dg$ is the $\pi$-th Fourier coefficient of
$f$ (a $d_\pi \times d_\pi$ matrix). The decomposition formula (2.1) is an analogue of the usual Fourier
analysis in $L^2([0,1])$ which corresponds to the situation $G = \mathbb{R}/\mathbb{Z}$ (the torus in dimension 1)
for which $\hat{G} = \mathbb{Z}$, the representations $\pi$ are the usual trigonometric polynomials $\pi(g) = e^{i 2\boldsymbol{\pi} \ell g}$
for some $\ell \in \mathbb{Z}$ (with the bold symbol $\boldsymbol{\pi}$ denoting the number Pi). In this case, the matrices
$c_\pi(f)$ are one-dimensional ($d_\pi = 1$) and they equal the standard Fourier coefficients $c_\pi(f) =
c_\ell(f) = \int_0^1 f(g)\, e^{-i 2\boldsymbol{\pi} \ell g}\,dg$. For $G = \mathbb{R}/\mathbb{Z}$, one thus retrieves the classical Fourier decomposition
of a periodic function $f : [0,1] \to \mathbb{R}$ as $f(g) = \sum_{\ell \in \mathbb{Z}} c_\ell(f)\, e^{i 2\boldsymbol{\pi} \ell g}$.

Definition 2.1. Let $k \in \mathbb{N}^*$. Let $A \in \mathcal{M}_{k \times k}(\mathbb{C})$ (the set of $k \times k$ matrices with complex entries).
The Frobenius norm of $A$ is defined by $\|A\|_F = \sqrt{\mathrm{Tr}(A A^*)}$. It is the norm induced by the
inner product $\langle A, B \rangle_F = \mathrm{Tr}(A B^*)$ of two matrices $A, B \in \mathcal{M}_{k \times k}(\mathbb{C})$.

By Parseval's relation, it follows that

$$\|f\|^2 = \|f\|^2_{L^2(G)} = \int_G |f(g)|^2\,dg = \sum_{\pi \in \hat{G}} d_\pi \|c_\pi(f)\|_F^2$$

for any $f \in L^2(G)$. The following definitions of a Sobolev norm and Sobolev spaces have been
proposed in [KK08].

Definition 2.2. Let $f \in L^2(G)$ and $s > \dim(G)/2$. The Sobolev norm of order $s$ of $f$ is defined
by

$$\|f\|^2_{H_s} = \int_G |f(g)|^2\,dg + \sum_{\pi \in \hat{G}} \lambda_\pi^s\, d_\pi\, \mathrm{Tr}\left(c_\pi(f)\, c_\pi(f)^*\right) = \int_G |f(g)|^2\,dg + \sum_{\pi \in \hat{G}} \lambda_\pi^s\, d_\pi\, \|c_\pi(f)\|_F^2,$$

where $\lambda_\pi$ is the eigenvalue of $\pi$ associated to the Laplace-Beltrami operator induced by
the Riemannian structure of the Lie group $G$.

Definition 2.3. Let $s > \dim(G)/2$ and denote by $C^\infty(G)$ the space of infinitely differentiable
functions on $G$. The Sobolev space $H_s(G)$ of order $s$ is the completion of $C^\infty(G)$ with respect
to the norm $\|\cdot\|_{H_s}$. Let $A > 0$. The Sobolev ball of radius $A$ and order $s$ in $L^2(G)$ is defined as

$$H_s(G, A) = \left\{ f \in H_s(G) : \|f\|^2_{H_s} \le A^2 \right\}.$$
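
As a concrete reading of Definitions 2.2 and 2.3, here is a minimal Python sketch for the torus $G = \mathbb{R}/\mathbb{Z}$, where $d_\pi = 1$. It assumes the eigenvalue normalization $\lambda_\ell = (2\boldsymbol{\pi}\ell)^2$ (the eigenvalue of $-d^2/dg^2$ on $e^{i2\boldsymbol{\pi}\ell g}$) and a finite frequency range `max_freq`; both are choices made for this illustration, and the function names are hypothetical.

```python
import cmath
import math

def fourier_coefficient(values, ell):
    """Riemann-sum approximation of c_l(f) on a uniform grid of [0, 1)."""
    n = len(values)
    return sum(v * cmath.exp(-2j * math.pi * ell * k / n)
               for k, v in enumerate(values)) / n

def sobolev_norm_squared(values, s, max_freq):
    """Sum over |l| <= max_freq of (1 + lambda_l^s) |c_l(f)|^2.

    The term without lambda_l^s is the L2 norm, rewritten with Parseval's relation.
    """
    total = 0.0
    for ell in range(-max_freq, max_freq + 1):
        eigenvalue = (2 * math.pi * ell) ** 2  # assumed normalization of lambda_l
        total += (1.0 + eigenvalue ** s) * abs(fourier_coefficient(values, ell)) ** 2
    return total

def in_sobolev_ball(values, s, radius, max_freq):
    """Membership test for the ball H_s(G, A) with A = radius."""
    return sobolev_norm_squared(values, s, max_freq) <= radius ** 2

N = 64
cosine = [math.cos(2 * math.pi * k / N) for k in range(N)]
norm_sq = sobolev_norm_squared(cosine, s=1.0, max_freq=8)
```

For $f(g) = \cos(2\boldsymbol{\pi} g)$ only the coefficients $c_{\pm 1} = 1/2$ are nonzero, so with $s = 1$ the squared norm is $\tfrac{1}{2}(1 + 4\boldsymbol{\pi}^2)$.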

It can be checked that $H_s(G)$ corresponds to the usual notion of a Sobolev space in the case
$G = \mathbb{R}/\mathbb{Z}$. Now, let $\hat{f} \in L^2(G)$ be an estimator of $f^*$, i.e. a measurable mapping of the random
processes $Y_m$, $m = 1, \ldots, n$, taking its value in $L^2(G)$. The quadratic risk of an estimator $\hat{f}$ is
defined as

$$R(\hat{f}, f^*) = \mathbb{E}\left(\|\hat{f} - f^*\|^2\right) = \mathbb{E}\left(\int_G |\hat{f}(g) - f^*(g)|^2\,dg\right).$$

Definition 2.4. The minimax risk over Sobolev balls associated to model (1.1) is defined as

$$\mathcal{R}_n(A, s) = \inf_{\hat{f} \in L^2(G)}\ \sup_{f^* \in H_s(G, A)} R(\hat{f}, f^*),$$

where the above infimum is taken over the set of all estimators.

The main goal of this paper is then to derive asymptotic upper and lower bounds on the
minimax risk $\mathcal{R}_n(A, s)$ as $n \to +\infty$.



2.2 Construction of the estimator 

First, note that the white noise model (1.1) has to be interpreted in the following sense: let
$f \in L^2(G)$, then conditionally to $\tau_m$ each integral $\int_G f(g)\,dY_m(g)$ of the "data" $dY_m(g)$ is a
random variable normally distributed with mean $\int_G f(g) f^*(\tau_m^{-1} g)\,dg$ and variance $\epsilon^2 \int_G |f(g)|^2\,dg$.
Moreover,

$$\mathbb{E}\left(\int_G f_1(g)\,dW_m(g)\ \overline{\int_G f_2(g)\,dW_m(g)}\right) = \int_G f_1(g)\,\overline{f_2(g)}\,dg$$

for $f_1, f_2 \in L^2(G)$ and any $m \in [\![1,n]\!]$. Therefore, using Fourier analysis on compact Lie groups, one may re-write model
(1.1) in the Fourier domain as

$$c_\pi(Y_m) = \int_G \pi(g^{-1})\,dY_m(g) = c_\pi(f_m) + \epsilon\, c_\pi(W_m), \quad \text{for } \pi \in \hat{G} \text{ and } m \in [\![1,n]\!], \tag{2.2}$$

where

$$c_\pi(f_m) = \int_G f_m(g)\,\pi(g^{-1})\,dg \quad \text{and} \quad c_\pi(W_m) = \int_G \pi(g^{-1})\,dW_m(g).$$

Note that $c_\pi(f_m) = \int_G f^*(\tau_m^{-1} g)\,\pi(g^{-1})\,dg = \int_G f^*(g)\,\pi((\tau_m g)^{-1})\,dg$, which implies that

$$c_\pi(f_m) = c_\pi(f^*)\,\pi(\tau_m^{-1}), \quad m \in [\![1,n]\!].$$

Remark also that the coefficients $(c_\pi(W_m))_{k,l}$ of the matrix $c_\pi(W_m) \in \mathcal{M}_{d_\pi \times d_\pi}(\mathbb{C})$ are independent
complex random variables that are normally distributed with zero expectation and variance
$d_\pi^{-1}$. Moreover, note that

$$\mathbb{E}\left(\pi(\tau_m^{-1})\right) = c_\pi(h) \quad \text{and} \quad \mathbb{E}\left(c_\pi(Y_m)\right) = c_\pi(f^*)\,c_\pi(h).$$

Therefore, if we assume that $c_\pi(h)$ is an invertible matrix, it follows that an unbiased estimator
of the $\pi$-th Fourier coefficient of $f^*$ is given by the following deconvolution step in the Fourier
domain

$$\hat{c}_\pi(f^*) = \frac{1}{n} \sum_{m=1}^{n} c_\pi(Y_m)\,c_\pi(h)^{-1}. \tag{2.3}$$

An estimator of $f^*$ can then be constructed by defining for $g \in G$

$$\hat{f}_T(g) = \frac{1}{n} \sum_{m=1}^{n} \sum_{\pi \in \hat{G}_T} d_\pi\, \mathrm{Tr}\left(\pi(g)\,c_\pi(Y_m)\,c_\pi(h)^{-1}\right), \tag{2.4}$$

where $\hat{G}_T = \{\pi \in \hat{G} : \lambda_\pi < T\}$ for some $T > 0$ whose choice has to be discussed (note that
the cardinal of $\hat{G}_T$ is finite).
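
The deconvolution step (2.3) and the spectral cut-off in (2.4) can be illustrated on the torus $G = \mathbb{R}/\mathbb{Z}$ with a small simulation. The sketch below is only an illustration under stated assumptions: the mean pattern, the wrapped Laplace law for the shifts (whose coefficients are $c_\ell(h) = 1/(1 + (2\boldsymbol{\pi}\ell b)^2)$ for scale $b$), the grid discretization of the white noise, the frequency cut-off $|\ell| \le 3$ used in place of $\lambda_\pi < T$, and all function names are choices made here, not taken from the paper.

```python
import cmath
import math
import random

def fourier_coefficient(values, ell):
    """Riemann-sum approximation of c_l(f) on a uniform grid of [0, 1)."""
    n = len(values)
    return sum(v * cmath.exp(-2j * math.pi * ell * k / n)
               for k, v in enumerate(values)) / n

def mean_pattern(g):
    """Hypothetical mean pattern f*, a trigonometric polynomial of degree 2."""
    return math.cos(2 * math.pi * g) + 0.5 * math.sin(4 * math.pi * g)

def shift_density_coefficient(ell, scale):
    """c_l(h) for shifts following a Laplace(0, scale) law wrapped on the circle."""
    return 1.0 / (1.0 + (2 * math.pi * ell * scale) ** 2)

def simulate_curve(rng, grid_size, scale, noise_level):
    """One discretized observation of model (1.1): random shift plus white noise."""
    # The difference of two exponentials with mean `scale` is Laplace(0, scale).
    tau = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    sigma = noise_level * math.sqrt(grid_size)  # white-noise scaling on the grid
    return [mean_pattern(k / grid_size - tau) + sigma * rng.gauss(0.0, 1.0)
            for k in range(grid_size)]

def estimate_coefficients(curves, cutoff, scale, deconvolve):
    """Average the empirical coefficients; optionally divide by c_l(h) as in (2.3)."""
    estimates = {}
    for ell in range(-cutoff, cutoff + 1):
        average = sum(fourier_coefficient(c, ell) for c in curves) / len(curves)
        divisor = shift_density_coefficient(ell, scale) if deconvolve else 1.0
        estimates[ell] = average / divisor
    return estimates

def squared_error(estimates, grid_size):
    """Parseval: sum over retained frequencies of |estimate - c_l(f*)|^2."""
    truth = [mean_pattern(k / grid_size) for k in range(grid_size)]
    return sum(abs(c - fourier_coefficient(truth, ell)) ** 2
               for ell, c in estimates.items())

rng = random.Random(0)
GRID, SCALE, NOISE, CUTOFF = 32, 0.1, 0.1, 3
curves = [simulate_curve(rng, GRID, SCALE, NOISE) for _ in range(500)]
error_deconvolved = squared_error(estimate_coefficients(curves, CUTOFF, SCALE, True), GRID)
error_naive = squared_error(estimate_coefficients(curves, CUTOFF, SCALE, False), GRID)
```

Without the division by $c_\ell(h)$, the averaged curves estimate $f^* * h$ rather than $f^*$, so `error_naive` keeps a bias that does not vanish as the number of curves grows, whereas `error_deconvolved` only contains a variance term.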



2.3 Regularity assumptions on the density h 

It is well-known that the difficulty of a deconvolution problem is quantified by the smoothness
of the convolution kernel. The rate of convergence that can be expected from any estimator
depends on such smoothness assumptions. This issue has been well studied in the nonparametric
statistics literature on standard deconvolution problems (see e.g. [Fan91]). Following the
approach proposed in [KK08], we now discuss a smoothness assumption on the convolution kernel
$h$.

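Definition 2.5. Let $k \in \mathbb{N}^*$ and $|\cdot|_2$ be the standard Euclidean norm on $\mathbb{C}^k$. The operator norm
of $A \in \mathcal{M}_{k \times k}(\mathbb{C})$ is $\|A\|_{op} = \sup_{u \ne 0} \frac{|Au|_2}{|u|_2}$.

Definition 2.6. A function $f \in L^2(G)$ is said to be smooth of order $\nu > 0$ if $c_\pi(f)$ is an
invertible matrix for any $\pi \in \hat{G}$, and if there exist two constants $C_1, C_2 > 0$ such that

$$\|c_\pi(f)^{-1}\|^2_{op} \le C_1 \lambda_\pi^{\nu} \quad \text{and} \quad \|c_\pi(f)\|^2_{op} \le C_2 \lambda_\pi^{-\nu} \quad \text{for all } \pi \in \hat{G}.$$

Assumption 2.1. The density $h$ is smooth of order $\nu > 0$.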

Note that Assumption 2.1 corresponds to the case where, in most applications, the convolution
kernel $h$ leads to an inverse problem that is ill-posed, meaning in particular that there is no
bounded inverse deconvolution kernel. This can be seen in the assumption $\|c_\pi(f)^{-1}\|^2_{op} \le C_1 \lambda_\pi^{\nu}$,
which accounts for the setting where $\lim_{\lambda_\pi \to +\infty} \|c_\pi(f)^{-1}\|_{op} = +\infty$, meaning that the mapping
$f \mapsto f * h$ does not have a bounded inverse in $L^2(G)$. Examples of such convolution kernels
are discussed in [KR01], [KK08], and we refer to these papers and references therein for specific
examples.

3 Upper and lower bounds 

The following theorem gives the asymptotic behavior of the quadratic risk of $\hat{f}_T$ over Sobolev
balls using an appropriate choice for the regularization parameter $T$.

Theorem 3.1. Suppose that Assumption 2.1 holds. Let $\hat{f}_T$ be the estimator defined in (2.4)
with $T = T_n = \lfloor n^{\frac{2}{2s + 2\nu + \dim(G)}} \rfloor$. Let $s > 2\nu + \dim(G)$. Then, there exists a constant $K_1 > 0$ such
that

$$\limsup_{n \to \infty}\ \sup_{f^* \in H_s(G, A)} n^{\frac{2s}{2s + 2\nu + \dim(G)}}\, R(\hat{f}_{T_n}, f^*) \le K_1.$$

Therefore, under Assumption 2.1 on the density $h$, Theorem 3.1 shows that the quadratic
risk $R(\hat{f}_{T_n}, f^*)$ is of polynomial order of the sample size $n$, and that this rate deteriorates as
the smoothness $\nu$ of $h$ increases. The fact that estimating $f^*$ becomes harder with larger values of $\nu$
(the so-called degree of ill-posedness) is well known in standard deconvolution problems (see e.g.
[Fan91] and references therein). Hence, Theorem 3.1 shows that a similar phenomenon holds
in model (1.1) when using the deconvolution step (2.3). The rate of convergence $n^{-\frac{2s}{2s + 2\nu + \dim(G)}}$
corresponds to the minimax rate in model (1.2) for the problem of stochastic deconvolution over
Lie groups as described in [KK08].
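
The dependence of the rate on $\nu$ can be read off directly from the exponent. The short Python sketch below evaluates the exponent $2s/(2s + 2\nu + \dim(G))$ and the cut-off $T_n$ of Theorem 3.1; the numerical values of $s$, $\nu$, $\dim(G)$ and $n$ are arbitrary illustrations chosen so that $s > 2\nu + \dim(G)$, and the function names are hypothetical.

```python
import math

def rate_exponent(s, nu, dim):
    """Exponent a such that the risk bound of Theorem 3.1 is of order n^(-a)."""
    return 2.0 * s / (2.0 * s + 2.0 * nu + dim)

def cutoff(n, s, nu, dim):
    """Regularization parameter T_n = floor(n^(2 / (2s + 2nu + dim)))."""
    return math.floor(n ** (2.0 / (2.0 * s + 2.0 * nu + dim)))

# Exponents for increasing degrees of ill-posedness nu (s = 8, dim(G) = 3).
exponents = [rate_exponent(s=8.0, nu=nu, dim=3) for nu in (0.5, 1.0, 2.0)]
t_n = cutoff(n=10_000, s=8.0, nu=1.0, dim=3)
```

The list `exponents` is decreasing, which is the deterioration of the rate described above; the slow growth of `t_n` shows that few frequencies are kept even for large samples.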

Then, thanks to Theorem 3.2 below, there exists a connection between mean pattern
estimation in the setting of Grenander's pattern theory [Gre93, GM07] and the analysis of
deconvolution problems in nonparametric statistics. Indeed, in the following theorem, we derive
an asymptotic lower bound on $H_s(G, A)$ for the minimax risk $\mathcal{R}_n(A, s)$ which shows that the
rate of convergence $n^{-\frac{2s}{2s + 2\nu + \dim(G)}}$ cannot be improved. Thus, $\hat{f}_{T_n}$ is an optimal estimator of $f^*$
in the minimax sense.

Theorem 3.2. Suppose that Assumption 2.1 holds. Let $s > 2\nu + \dim G$. Then, there exists a
constant $K_2 > 0$ such that

$$\liminf_{n \to \infty}\ \inf_{\hat{f} \in L^2(G)}\ \sup_{f^* \in H_s(G, A)} n^{\frac{2s}{2s + 2\nu + \dim G}}\, R(\hat{f}, f^*) \ge K_2.$$

A Technical Appendix

A.1 Proof of Theorem 3.1

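By the classical bias/variance decomposition of the risk one has

$$R(\hat{f}_T, f^*) = \mathbb{E}\left\|\hat{f}_T - \mathbb{E}(\hat{f}_T)\right\|^2 + \left\|\mathbb{E}(\hat{f}_T) - f^*\right\|^2.$$

Let us first give an upper bound for the bias $\|\mathbb{E}(\hat{f}_T) - f^*\|^2$. By linearity of the trace operator
and by inverting expectation and sum (since $\mathrm{Card}(\hat{G}_T)$ is finite) one obtains that

$$\mathbb{E}\hat{f}_T(g) - f^*(g) = \sum_{\pi \in \hat{G}_T} d_\pi\, \mathrm{Tr}\left[\pi(g)\left(\frac{1}{n}\sum_{m=1}^{n} \mathbb{E}\left(c_\pi(Y_m)\right) c_\pi(h)^{-1} - c_\pi(f^*)\right)\right] - \sum_{\pi \in \hat{G} \setminus \hat{G}_T} d_\pi\, \mathrm{Tr}\left[\pi(g)\,c_\pi(f^*)\right].$$

Since the $(c_\pi(Y_m))_m$'s are i.i.d. random variables and $\mathbb{E}(c_\pi(Y_m)) = \mathbb{E}(c_\pi(f_m)) = c_\pi(f^*)\,c_\pi(h)$, we
obtain that

$$\mathbb{E}\hat{f}_T(g) - f^*(g) = -\sum_{\pi \in \hat{G} \setminus \hat{G}_T} d_\pi\, \mathrm{Tr}\left[\pi(g)\,c_\pi(f^*)\right].$$

Then, by Theorem B.2 one has

$$\left\|\mathbb{E}(\hat{f}_T) - f^*\right\|^2 = \sum_{\pi \in \hat{G} \setminus \hat{G}_T} d_\pi\, \mathrm{Tr}\left[c_\pi(f^*)\,c_\pi(f^*)^*\right] = \sum_{\pi \in \hat{G} \setminus \hat{G}_T} d_\pi\, \|c_\pi(f^*)\|_F^2.$$

Finally, since $\lambda_\pi \ge T$ for $\pi \notin \hat{G}_T$ and $\|f^*\|^2_{H_s} \le A^2$,
we obtain the following upper bound for the bias

$$\left\|\mathbb{E}(\hat{f}_T) - f^*\right\|^2 \le T^{-s} \sum_{\pi \in \hat{G} \setminus \hat{G}_T} \lambda_\pi^s\, d_\pi\, \|c_\pi(f^*)\|_F^2 \le T^{-s} A^2. \tag{A.1}$$

Let us now compute an upper bound for the variance term $\mathbb{E}\|\hat{f}_T - \mathbb{E}(\hat{f}_T)\|^2$:

$$\mathbb{E}\left\|\hat{f}_T - \mathbb{E}(\hat{f}_T)\right\|^2 = \mathbb{E}\left\|\sum_{\pi \in \hat{G}_T} d_\pi\, \mathrm{Tr}\left[\pi(g)\left(\frac{1}{n}\sum_{m=1}^{n} c_\pi(Y_m)\,c_\pi(h)^{-1} - c_\pi(f^*)\right)\right]\right\|^2 \le 2E_1 + 2E_2,$$

with

$$E_1 = \mathbb{E}\left\|\sum_{\pi \in \hat{G}_T} d_\pi\, \mathrm{Tr}\left[\pi(g)\left(\frac{1}{n}\sum_{m=1}^{n} c_\pi(f_m)\,c_\pi(h)^{-1} - c_\pi(f^*)\right)\right]\right\|^2, \qquad E_2 = \mathbb{E}\left\|\sum_{\pi \in \hat{G}_T} d_\pi\, \mathrm{Tr}\left[\pi(g)\,\frac{\epsilon}{n}\sum_{m=1}^{n} c_\pi(W_m)\,c_\pi(h)^{-1}\right]\right\|^2,$$

using that $c_\pi(Y_m) = c_\pi(f_m) + \epsilon\,c_\pi(W_m)$.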

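Let us first consider the term $E_2$. By Theorem B.2 and by decomposing the trace

$$E_2 = \epsilon^2\, \mathbb{E}\left(\sum_{\pi \in \hat{G}_T} d_\pi\, \mathrm{Tr}\left[\frac{1}{n^2}\sum_{m,m'=1}^{n} c_\pi(W_m)\,c_\pi(h)^{-1}\left(c_\pi(W_{m'})\,c_\pi(h)^{-1}\right)^*\right]\right)$$
$$= \epsilon^2\, \mathbb{E}\left(\sum_{\pi \in \hat{G}_T} d_\pi\, \frac{1}{n^2}\sum_{m,m'=1}^{n}\ \sum_{k,j=1}^{d_\pi} \left(c_\pi(W_m)\,c_\pi(h)^{-1}\right)_{kj}\overline{\left(c_\pi(W_{m'})\,c_\pi(h)^{-1}\right)_{kj}}\right)$$
$$= \epsilon^2\, \mathbb{E}\left(\sum_{\pi \in \hat{G}_T} d_\pi\, \frac{1}{n^2}\sum_{m,m'=1}^{n}\ \sum_{k,j=1}^{d_\pi}\ \sum_{i,i'=1}^{d_\pi} \left(c_\pi(W_m)\right)_{ki}\left(c_\pi(h)^{-1}\right)_{ij}\overline{\left(c_\pi(W_{m'})\right)_{ki'}\left(c_\pi(h)^{-1}\right)_{i'j}}\right).$$

By the Fubini-Tonelli theorem, we can invert sum and integral, and since the $\left((c_\pi(W_m))_{kl}\right)_{k,l}$ are
i.i.d. Gaussian variables with zero expectation and variance $d_\pi^{-1}$, it follows that

$$E_2 = \frac{\epsilon^2}{n} \sum_{\pi \in \hat{G}_T} d_\pi \sum_{i,j=1}^{d_\pi} \left|\left(c_\pi(h)^{-1}\right)_{ij}\right|^2.$$

Then, thanks to the properties of the operator norm, one has $\sum_{j=1}^{d_\pi} \left|\left(c_\pi(h)^{-1}\right)_{ij}\right|^2 \le \|c_\pi(h)^{-1}\|^2_{op}$,
and therefore

$$E_2 \le \frac{\epsilon^2}{n} \sum_{\pi \in \hat{G}_T} d_\pi^2\, \|c_\pi(h)^{-1}\|^2_{op}. \tag{A.2}$$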

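Let us now compute an upper bound for $E_1$. Since $c_\pi(f_m) = c_\pi(f^*)\,\pi(\tau_m^{-1})$ and by Theorem B.2

$$E_1 = \mathbb{E}\left\|\sum_{\pi \in \hat{G}_T} d_\pi\, \mathrm{Tr}\left[\pi(g)\left(\frac{1}{n}\sum_{m=1}^{n} c_\pi(f^*)\,\pi(\tau_m^{-1})\,c_\pi(h)^{-1} - c_\pi(f^*)\right)\right]\right\|^2 = \mathbb{E}\left(\sum_{\pi \in \hat{G}_T} d_\pi \left\|\frac{1}{n}\sum_{m=1}^{n} c_\pi(f^*)\,\pi(\tau_m^{-1})\,c_\pi(h)^{-1} - c_\pi(f^*)\right\|_F^2\right).$$

By the Fubini-Tonelli theorem, we invert sum and integral, and since the random variables $\tau_m$ are
i.i.d.

$$E_1 = \frac{1}{n} \sum_{\pi \in \hat{G}_T} d_\pi\, \mathbb{E}\left(\left\|c_\pi(f^*)\,\pi(\tau_1^{-1})\,c_\pi(h)^{-1} - c_\pi(f^*)\right\|_F^2\right)$$
$$= \frac{1}{n} \sum_{\pi \in \hat{G}_T} d_\pi\, \mathbb{E}\left(\left\|c_\pi(f^*)\,\pi(\tau_1^{-1})\,c_\pi(h)^{-1}\right\|_F^2 + \|c_\pi(f^*)\|_F^2 - 2\,\mathrm{Tr}\left[c_\pi(f^*)\,\pi(\tau_1^{-1})\,c_\pi(h)^{-1}\,c_\pi(f^*)^*\right]\right),$$

where the last equality follows by definition of the Frobenius norm. Now remark that

$$\mathbb{E}\left(\mathrm{Tr}\left[c_\pi(f^*)\,\pi(\tau_1^{-1})\,c_\pi(h)^{-1}\,c_\pi(f^*)^*\right]\right) = \mathrm{Tr}\left[c_\pi(f^*)\,\mathbb{E}\left(\pi(\tau_1^{-1})\right)c_\pi(h)^{-1}\,c_\pi(f^*)^*\right] = \mathrm{Tr}\left[c_\pi(f^*)\,c_\pi(f^*)^*\right] = \|c_\pi(f^*)\|_F^2,$$

and let us compute $\mathbb{E}\left(\|c_\pi(f^*)\,\pi(\tau_1^{-1})\,c_\pi(h)^{-1}\|_F^2\right)$. Recall that

$$\|PQ\|_F \le \|P\|_F\, \|Q\|_{op}$$

for any $P, Q \in \mathcal{M}_{d_\pi \times d_\pi}(\mathbb{C})$ and that the operator norm is a multiplicative norm, which implies
that

$$\mathbb{E}\left(\left\|c_\pi(f^*)\,\pi(\tau_1^{-1})\,c_\pi(h)^{-1}\right\|_F^2\right) \le \|c_\pi(f^*)\|_F^2\ \mathbb{E}\left(\|\pi(\tau_1^{-1})\|^2_{op}\right) \|c_\pi(h)^{-1}\|^2_{op}.$$

Since the operator norm is the smallest matrix norm, one has that $\mathbb{E}\left(\|\pi(\tau_1^{-1})\|^2_{op}\right) \le \mathbb{E}\left(\|\pi(\tau_1^{-1})\|^2_F\right)$.
Now, since $\|\pi(\tau_1^{-1})\|_F^2 = \mathrm{Tr}\left[\pi(\tau_1^{-1})\,\pi(\tau_1)\right] = \mathrm{Tr}\left[\mathrm{Id}_{d_\pi}\right] = d_\pi$, it follows that
$\mathbb{E}\left(\|\pi(\tau_1^{-1})\|^2_{op}\right) \le d_\pi$, and therefore

$$E_1 \le \frac{1}{n} \sum_{\pi \in \hat{G}_T} d_\pi^2\, \|c_\pi(f^*)\|_F^2 \left(\|c_\pi(h)^{-1}\|^2_{op} - d_\pi^{-1}\right). \tag{A.3}$$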

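Thus, combining the bounds (A.2) and (A.3)

$$\mathbb{E}\left(\left\|\hat{f}_T - \mathbb{E}(\hat{f}_T)\right\|^2\right) \le \frac{2}{n} \sum_{\pi \in \hat{G}_T} d_\pi^2 \left(\|c_\pi(f^*)\|_F^2 \left(\|c_\pi(h)^{-1}\|^2_{op} - d_\pi^{-1}\right) + \epsilon^2\, \|c_\pi(h)^{-1}\|^2_{op}\right)$$
$$\le \frac{2}{n} \sum_{\pi \in \hat{G}_T} d_\pi^2\, \|c_\pi(h)^{-1}\|^2_{op} \left(\|c_\pi(f^*)\|_F^2 + \epsilon^2\right).$$

Since $f^* \in H_s(G, A)$, this implies that $\|c_\pi(f^*)\|_F^2 \le M$, for some constant $M$ that is independent
of $\pi$ and $f^*$. Hence $\|c_\pi(f^*)\|_F^2 + \epsilon^2 \le (M + \epsilon^2)$. Assumption 2.1 on the smoothness of $h$ thus
implies

$$\mathbb{E}\left(\left\|\hat{f}_T - \mathbb{E}(\hat{f}_T)\right\|^2\right) \le \frac{2(M + \epsilon^2)}{n} \sum_{\pi \in \hat{G}_T} d_\pi^2\, \|c_\pi(h)^{-1}\|^2_{op} \le \frac{2 C_1 (M + \epsilon^2)}{n}\, T^{\nu} \sum_{\pi \in \hat{G}_T} d_\pi^2 \le \frac{C}{n}\, T^{\nu + (\dim(G)/2)}, \tag{A.4}$$

where the last inequality follows by Proposition C.1 and $C > 0$ is some constant that is independent
of $f^* \in H_s(G, A)$. Therefore, combining the bounds (A.1) and (A.4), it follows that
$R(\hat{f}_T, f^*) \le L(T)$ where $L(T) = T^{-s} A^2 + \frac{C}{n}\, T^{\nu + (\dim(G)/2)}$ (note that $L(T)$ does not depend
on $f^* \in H_s(G, A)$). Let us now search among the estimators $(\hat{f}_T)_T$ the ones which minimize
the upper bound of the quadratic risk. Balancing the two terms of the function $T \mapsto L(T)$ leads to the choice
$T = \lfloor n^{\frac{2}{2s + 2\nu + \dim(G)}} \rfloor$, such that

$$L\left(\lfloor n^{\frac{2}{2s + 2\nu + \dim(G)}} \rfloor\right) \le A^2\, n^{\frac{-2s}{2s + 2\nu + \dim(G)}} + C\, n^{\frac{-2s}{2s + 2\nu + \dim(G)}} \le C'\, n^{\frac{-2s}{2s + 2\nu + \dim(G)}}$$

for some constant $C' > 0$ and all sufficiently large $n$, which completes the proof of Theorem 3.1. □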
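
The bias/variance trade-off behind the choice of $T$ can be checked numerically. The Python sketch below treats $T$ as a continuous variable, minimizes $L(T) = T^{-s}A^2 + \frac{C}{n}T^{\nu + \dim(G)/2}$ in closed form by setting its derivative to zero, and compares the minimizer with $n^{2/(2s + 2\nu + \dim(G))}$. The parameter values and the function names are illustrative assumptions.

```python
def risk_bound(t, n, s, nu, dim, radius, const):
    """Upper bound L(T) = A^2 T^(-s) + (C / n) T^(nu + dim/2)."""
    return radius ** 2 * t ** (-s) + const / n * t ** (nu + dim / 2.0)

def optimal_cutoff(n, s, nu, dim, radius, const):
    """Zero of L'(T): T^(s + nu + dim/2) = s A^2 n / (C (nu + dim/2))."""
    growth = nu + dim / 2.0
    return (s * radius ** 2 * n / (const * growth)) ** (1.0 / (s + growth))

params = dict(s=8.0, nu=1.0, dim=3, radius=1.0, const=1.0)
rate_power = 2.0 / (2 * params["s"] + 2 * params["nu"] + params["dim"])

# Ratio between the exact minimizer and n^(2 / (2s + 2nu + dim)) for several n.
ratios = []
for n in (10 ** 3, 10 ** 5, 10 ** 7):
    ratios.append(optimal_cutoff(n, **params) / n ** rate_power)

t_opt = optimal_cutoff(1000, **params)
```

The ratios do not depend on $n$, so the exact minimizer and $n^{2/(2s + 2\nu + \dim(G))}$ differ only by a constant factor, which is absorbed in the constant of the final risk bound.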



A.2 Proof of Theorem 3.2

To obtain a lower bound, we use an adaptation of Assouad's cube technique (see e.g. [Tsy09]
and references therein) to model (1.1), which differs from the standard white noise models
classically studied in nonparametric statistics. Note that for any subset $\Omega \subset H_s(G, A)$

$$\inf_{\hat{f}}\ \sup_{f^* \in H_s(G, A)} R(\hat{f}, f^*) \ge \inf_{\hat{f}}\ \sup_{f^* \in \Omega} R(\hat{f}, f^*).$$

The main idea is to find an appropriate subset $\Omega$ of test functions that will allow us to compute
an asymptotic lower bound for $\inf_{\hat{f}} \sup_{f^* \in \Omega} R(\hat{f}, f^*)$, and thus the result of Theorem 3.2 will
immediately follow by the above inequality.



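A.2.1 Choice of a subset $\Omega$ of test functions

Let us consider a set $\Omega_D$ of the following form:

$$\Omega_D = \left\{ f_w : G \to \mathbb{C} \ :\ \forall g \in G,\ f_w(g) = \sqrt{\mu_D} \sum_{\pi \in \hat{G}_D}\ \sum_{k,l=1}^{d_\pi} d_\pi\, w_{\pi,kl}\, \pi_{lk}(g),\ \ w_{\pi,kl} \in \left\{-d_\pi^{-1/2}, d_\pi^{-1/2}\right\} \right\},$$

where $\hat{G}_D = \{\pi \in \hat{G} : D \le \lambda_\pi < 2D\}$ and $\mu_D \in \mathbb{R}_+$. To simplify the presentation of the
proof, we will write $\Omega = \Omega_D$. Let $\mathcal{W} = \prod_{\pi \in \hat{G}_D} \left\{-d_\pi^{-1/2}, d_\pi^{-1/2}\right\}^{d_\pi^2}$. In what follows, the notation
$w = (w_{\pi,kl})_{\pi \in \hat{G}_D,\ 1 \le k,l \le d_\pi} \in \mathcal{W}$ is used to denote the set of coefficients $w_{\pi,kl}$ taking their value
in $\left\{-d_\pi^{-1/2}, d_\pi^{-1/2}\right\}$. The notation $\mathbb{E}_w$ will be used to denote expectation with respect to the
distribution $\mathbb{P}_w$ of the random processes $Y_m$, $m \in [\![1,n]\!]$ in model (1.1) under the hypothesis that
$f^* = f_w$.

Note that any $f_w \in \Omega$ can be written as $f_w(g) = \sqrt{\mu_D} \sum_{\pi \in \hat{G}_D} d_\pi\, \mathrm{Tr}\left[\pi(g)\,w_\pi\right]$, where $w_\pi =
(w_{\pi,kl})_{1 \le k,l \le d_\pi}$. Let $|\Omega| = \mathrm{Card}(\Omega)$ and let us search for a condition on $\mu_D$ such that $\Omega \subset
H_s(G, A)$. Note that $c_\pi(f_w) = \sqrt{\mu_D}\, w_\pi$, which implies

$$f_w \in H_s(G, A) \iff \sum_{\pi \in \hat{G}_D} (1 + \lambda_\pi^s)\, d_\pi\, \mu_D\, \|w_\pi\|_F^2 \le A^2 \iff \sum_{\pi \in \hat{G}_D} (1 + \lambda_\pi^s)\, \mu_D\, d_\pi^2 \le A^2,$$

using the equality $\mathrm{Tr}\left[w_\pi w_\pi^*\right] = \sum_{k,l=1}^{d_\pi} |w_{\pi,kl}|^2 = d_\pi$, which follows from the fact that $|w_{\pi,kl}| = d_\pi^{-1/2}$.
Since $\pi \in \hat{G}_D$, one has that $\lambda_\pi < 2D$, and thus

$$\mu_D \sum_{\pi \in \hat{G}_D} d_\pi^2 \le 2^{-s} D^{-s} A^2 / 2 \implies \sum_{\pi \in \hat{G}_D} (1 + \lambda_\pi^s)\, \mu_D\, d_\pi^2 \le A^2.$$

Moreover, by Proposition C.1 we have that for $D$ sufficiently
large, $\sum_{\pi \in \hat{G}_D} d_\pi^2 \le C D^{\dim G / 2}$ for some constant $C > 0$, and therefore for such a $D$, it follows
that $\mu_D \le 2^{-s} D^{-s - \dim G/2} (A^2/2)\, C^{-1} \implies \mu_D \sum_{\pi \in \hat{G}_D} d_\pi^2 \le 2^{-s} D^{-s} A^2 / 2$. Hence, there exists
a sufficiently large $D_0$ such that for all $D \ge D_0$ the condition $\mu_D \le \kappa D^{-s - \dim G/2}$ for some
$\kappa > 0$ (independent of $D$) implies that $\Omega \subset H_s(G, A)$. In what follows, we thus assume that
$\mu_D = \kappa D^{-s - \dim G/2}$ for some $\kappa > 0$ satisfying the above condition and $D \ge D_0$.

A.2.2 Minoration of the quadratic risk over $\Omega$

Note that the supremum over $\Omega$ of the quadratic risk of any estimator $\hat{f}$ can be bounded from
below as follows. First, remark that by Theorem B.2

$$\sup_{f_w \in \Omega} R(\hat{f}, f_w) = \sup_{w \in \mathcal{W}} \mathbb{E}_w \left\|\hat{f} - f_w\right\|^2 \ge \sup_{w \in \mathcal{W}} \mathbb{E}_w \sum_{\pi \in \hat{G}_D} d_\pi \sum_{k,l=1}^{d_\pi} \left|(c_\pi(\hat{f}))_{kl} - \sqrt{\mu_D}\, w_{\pi,kl}\right|^2$$
$$\ge \frac{1}{|\Omega|} \sum_{w \in \mathcal{W}}\ \sum_{\pi \in \hat{G}_D} d_\pi \sum_{k,l=1}^{d_\pi} \mathbb{E}_w\left(\left|(c_\pi(\hat{f}))_{kl} - \sqrt{\mu_D}\, w_{\pi,kl}\right|^2\right), \tag{A.5}$$

with $|\Omega| = 2^{\sum_{\pi \in \hat{G}_D} d_\pi^2}$. Now, define for all $\pi \in \hat{G}_D$, $k, l \in [\![1, d_\pi]\!]$ the coefficients

$$w^*_{\pi,kl} = \underset{v \in \{-d_\pi^{-1/2},\, d_\pi^{-1/2}\}}{\arg\min} \left|(c_\pi(\hat{f}))_{kl} - \sqrt{\mu_D}\, v\right|.$$

The inequalities

$$\sqrt{\mu_D}\left|w_{\pi,kl} - w^*_{\pi,kl}\right| \le \left|\sqrt{\mu_D}\, w^*_{\pi,kl} - (c_\pi(\hat{f}))_{kl}\right| + \left|\sqrt{\mu_D}\, w_{\pi,kl} - (c_\pi(\hat{f}))_{kl}\right| \le 2\left|\sqrt{\mu_D}\, w_{\pi,kl} - (c_\pi(\hat{f}))_{kl}\right|$$

imply that $\mu_D \left|w_{\pi,kl} - w^*_{\pi,kl}\right|^2 \le 4\left|\sqrt{\mu_D}\, w_{\pi,kl} - (c_\pi(\hat{f}))_{kl}\right|^2$, and thus by inequality (A.5)

$$\sup_{f_w \in \Omega} R(\hat{f}, f_w) \ge \frac{\mu_D}{4|\Omega|} \sum_{\pi \in \hat{G}_D} d_\pi \sum_{k,l=1}^{d_\pi}\ \sum_{w \in \mathcal{W}} \mathbb{E}_w\left|w_{\pi,kl} - w^*_{\pi,kl}\right|^2$$
$$= \frac{\mu_D}{4|\Omega|} \sum_{\pi \in \hat{G}_D} d_\pi \sum_{k,l=1}^{d_\pi}\ \sum_{w \in \mathcal{W} \,:\, w_{\pi,kl} = d_\pi^{-1/2}} \left(\mathbb{E}_w\left|w_{\pi,kl} - w^*_{\pi,kl}\right|^2 + \mathbb{E}_{w^{(\pi,kl)}}\left|w^{(\pi,kl)}_{\pi,kl} - w^*_{\pi,kl}\right|^2\right), \tag{A.6}$$

where for all $\pi' \in \hat{G}_D$, $k', l' \in [\![1, d_{\pi'}]\!]$, we define
$w^{(\pi,kl)} = \left(w^{(\pi,kl)}_{\pi',k'l'}\right)$ such that

$$w^{(\pi,kl)}_{\pi',k'l'} = w_{\pi',k'l'} \ \text{ if } \pi' \ne \pi \text{ or } (k',l') \ne (k,l), \qquad w^{(\pi,kl)}_{\pi',k'l'} = -w_{\pi,kl} \ \text{ if } \pi' = \pi \text{ and } (k',l') = (k,l).$$

Note that the above minoration depends on $\hat{f}$. Let us introduce the notation

$$C_{\pi,kl} := \mathbb{E}_w\left(\left|w_{\pi,kl} - w^*_{\pi,kl}\right|^2\right) + \mathbb{E}_{w^{(\pi,kl)}}\left(\left|w^{(\pi,kl)}_{\pi,kl} - w^*_{\pi,kl}\right|^2\right).$$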
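In what follows, we show that $C_{\pi,kl}$ can be bounded from below independently of $\hat{f}$.

A.2.3 A lower bound for $C_{\pi,kl}$

Let $\pi \in \hat{G}_D$, $k, l \in [\![1, d_\pi]\!]$ be fixed. Denote by $X = (c_\pi(Y_m))_{(\pi,m) \in \hat{G} \times [\![1,n]\!]}$ the data set in the
Fourier domain. In what follows, the notation $\mathbb{E}_{w,\tau}$ is used to denote expectation with respect
to the distribution $\mathbb{P}_{w,\tau}$ of the random processes $Y_m$, $m \in [\![1,n]\!]$ in model (1.1) conditionally to
$\tau = (\tau_1, \ldots, \tau_n)$ and under the hypothesis that $f^* = f_w$. The notation $w = 0$ is used to denote
the hypothesis $f^* = 0$ in model (1.1). Therefore, using these notations, one can write that

$$C_{\pi,kl} = \int_{G^n} \mathbb{E}_{w,\tau}\left(\left|w_{\pi,kl} - w^*_{\pi,kl}\right|^2\right) h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n + \int_{G^n} \mathbb{E}_{w^{(\pi,kl)},\tau}\left(\left|w^{(\pi,kl)}_{\pi,kl} - w^*_{\pi,kl}\right|^2\right) h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n$$
$$= \int_{G^n} \mathbb{E}_{0,\tau}\left(\left|w_{\pi,kl} - w^*_{\pi,kl}\right|^2 \frac{d\mathbb{P}_{w,\tau}}{d\mathbb{P}_{0,\tau}}(X)\right) h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n + \int_{G^n} \mathbb{E}_{0,\tau}\left(\left|w^{(\pi,kl)}_{\pi,kl} - w^*_{\pi,kl}\right|^2 \frac{d\mathbb{P}_{w^{(\pi,kl)},\tau}}{d\mathbb{P}_{0,\tau}}(X)\right) h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n$$
$$= \int_{G^n} \mathbb{E}_{0}\left(\left|w_{\pi,kl} - w^*_{\pi,kl}\right|^2 \frac{d\mathbb{P}_{w,\tau}}{d\mathbb{P}_{0}}(X)\right) h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n + \int_{G^n} \mathbb{E}_{0}\left(\left|w^{(\pi,kl)}_{\pi,kl} - w^*_{\pi,kl}\right|^2 \frac{d\mathbb{P}_{w^{(\pi,kl)},\tau}}{d\mathbb{P}_{0}}(X)\right) h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n,$$

where the last equality follows from the fact that, under the hypothesis $f^* = 0$, the data $X$
in model (1.1) do not depend on $\tau$. By inverting sum and integral, and using the Fubini-Tonelli
theorem, we obtain

$$C_{\pi,kl} = \mathbb{E}_0\left(\left|w_{\pi,kl} - w^*_{\pi,kl}\right|^2 \int_{G^n} \frac{d\mathbb{P}_{w,\tau}}{d\mathbb{P}_0}(X)\, h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n + \left|w^{(\pi,kl)}_{\pi,kl} - w^*_{\pi,kl}\right|^2 \int_{G^n} \frac{d\mathbb{P}_{w^{(\pi,kl)},\tau}}{d\mathbb{P}_0}(X)\, h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n\right).$$

Introduce the notations

$$Q(X) = \int_{G^n} \frac{d\mathbb{P}_{w,\alpha}}{d\mathbb{P}_0}(X)\, h(\alpha_1) \cdots h(\alpha_n)\,d\alpha_1 \cdots d\alpha_n \quad \text{and} \quad Q^{(\pi,kl)}(X) = \int_{G^n} \frac{d\mathbb{P}_{w^{(\pi,kl)},\alpha}}{d\mathbb{P}_0}(X)\, h(\alpha_1) \cdots h(\alpha_n)\,d\alpha_1 \cdots d\alpha_n.$$

Since $w^{(\pi,kl)}_{\pi,kl} - w^*_{\pi,kl} = -w_{\pi,kl} - w^*_{\pi,kl}$ with $w_{\pi,kl} \in \left\{-d_\pi^{-1/2}, d_\pi^{-1/2}\right\}$ and $w^*_{\pi,kl} \in \left\{-d_\pi^{-1/2}, d_\pi^{-1/2}\right\}$,
it follows that

$$C_{\pi,kl} \ge 4 d_\pi^{-1}\, \mathbb{E}_0\left(\min\left(Q(X), Q^{(\pi,kl)}(X)\right)\right) = 4 d_\pi^{-1}\, \mathbb{E}_0\left(Q(X) \min\left(1, \frac{Q^{(\pi,kl)}(X)}{Q(X)}\right)\right)$$
$$= 4 d_\pi^{-1} \int_{G^n} \mathbb{E}_0\left(\frac{d\mathbb{P}_{w,\tau}}{d\mathbb{P}_0}(X) \min\left(1, \frac{Q^{(\pi,kl)}(X)}{Q(X)}\right)\right) h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n$$
$$= 4 d_\pi^{-1} \int_{G^n} \mathbb{E}_{w,\tau}\left(\min\left(1, \frac{Q^{(\pi,kl)}(X)}{Q(X)}\right)\right) h(\tau_1) \cdots h(\tau_n)\,d\tau_1 \cdots d\tau_n = 4 d_\pi^{-1}\, \mathbb{E}_w\left(\min\left(1, \frac{Q^{(\pi,kl)}(X)}{Q(X)}\right)\right). \tag{A.7}$$

Let us now compute a lower bound for $\mathbb{E}_w\left(\min\left(1, \frac{Q^{(\pi,kl)}(X)}{Q(X)}\right)\right)$. Note that for any $0 < \delta < 1$

$$\mathbb{E}_w\left(\min\left(1, \frac{Q^{(\pi,kl)}(X)}{Q(X)}\right)\right) \ge \delta\, \mathbb{P}_w\left(\frac{Q^{(\pi,kl)}(X)}{Q(X)} \ge \delta\right). \tag{A.8}$$

Proposition A.1. Let $\pi \in \hat{G}_D$, $k, l \in [\![1, d_\pi]\!]$ be fixed. Let $\mu_D = \kappa D^{-s - \dim G/2}$ and $D =
\lfloor n^{\frac{2}{2s + 2\nu + \dim G}} \rfloor$. Suppose that $s > 2\nu + \dim G$. Then, there exist $0 < \delta < 1$ and a constant $C > 0$
such that, for all sufficiently large $n$,

$$\mathbb{P}_w\left(\frac{Q^{(\pi,kl)}(X)}{Q(X)} \ge \delta\right) \ge C.$$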
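Proof. Throughout the proof, we assume that $\mu_D = \kappa D^{-s - \dim G/2}$ and $D = \lfloor n^{\frac{2}{2s + 2\nu + \dim G}} \rfloor$. To
simplify the presentation, we also write $\mathbb{E} = \mathbb{E}_w$ and $\mathbb{P} = \mathbb{P}_w$. Then, thanks to Proposition
C.1, it follows that $d_\pi^2 \lesssim D^{(\dim G)/2}$, and therefore, under the assumption that
$s > 2\nu + \dim G$, one obtains the following relations (needed later on in the proof)

$$n \mu_D^2 d_\pi^4 \to 0, \quad n \mu_D^{3/2} d_\pi^3 \to 0, \quad n d_\pi^4 \mu_D^2 D^{-\nu} \to 0 \quad \text{as } n \to +\infty, \tag{A.9}$$

and

$$n \mu_D D^{-\nu} = O(1) \quad \text{as } n \to +\infty. \tag{A.10}$$

Without loss of generality, we consider the case where $w_{\pi,kl} = -d_\pi^{-1/2}$ and $w^{(\pi,kl)}_{\pi,kl} = d_\pi^{-1/2}$ and
$\epsilon = 1$. To simplify the presentation, we also introduce the notation $\tilde{w}_\pi = w^{(\pi,kl)}_\pi$. In the proof,
we also make repeated use of the fact that

$$\|\tilde{w}_\pi\|_F^2 = d_\pi \quad \text{and} \quad \|w_\pi\|_F^2 = d_\pi. \tag{A.11}$$

Since $c_\pi(Y_m) = \sqrt{\mu_D}\, w_\pi\, \pi(\tau_m^{-1}) + c_\pi(W_m)$ (under the hypothesis that $f^* = f_w$) and using the fact
that $\|\tilde{w}_\pi\|_F^2 = \|w_\pi\|_F^2$, simple calculations on the likelihood ratios $\frac{d\mathbb{P}_{w,\alpha}}{d\mathbb{P}_0}(X)$ and $\frac{d\mathbb{P}_{w^{(\pi,kl)},\alpha}}{d\mathbb{P}_0}(X)$
yield that

$$\frac{Q^{(\pi,kl)}(X)}{Q(X)} = \frac{\prod_{m=1}^{n} \int_G \exp\left(\tilde{Z}^{(1)}_m + \tilde{Z}^{(2)}_m\right) h(\alpha_m)\,d\alpha_m}{\prod_{m=1}^{n} \int_G \exp\left(Z^{(1)}_m + Z^{(2)}_m\right) h(\alpha_m)\,d\alpha_m},$$

where

$$Z^{(1)}_m = d_\pi \mu_D \left\langle w_\pi \pi(\tau_m^{-1}),\, w_\pi \pi(\alpha_m^{-1}) \right\rangle_F, \qquad Z^{(2)}_m = d_\pi \sqrt{\mu_D} \left\langle c_\pi(W_m),\, w_\pi \pi(\alpha_m^{-1}) \right\rangle_F,$$
$$\tilde{Z}^{(1)}_m = d_\pi \mu_D \left\langle w_\pi \pi(\tau_m^{-1}),\, \tilde{w}_\pi \pi(\alpha_m^{-1}) \right\rangle_F, \qquad \tilde{Z}^{(2)}_m = d_\pi \sqrt{\mu_D} \left\langle c_\pi(W_m),\, \tilde{w}_\pi \pi(\alpha_m^{-1}) \right\rangle_F.$$

Note that by the Cauchy-Schwarz inequality

$$|Z^{(1)}_m|^2 \le d_\pi^2 \mu_D^2\, \|w_\pi \pi(\tau_m^{-1})\|_F^2\, \|w_\pi \pi(\alpha_m^{-1})\|_F^2 = d_\pi^2 \mu_D^2\, \|w_\pi\|_F^2\, \|w_\pi\|_F^2$$

and

$$|\tilde{Z}^{(1)}_m|^2 \le d_\pi^2 \mu_D^2\, \|w_\pi \pi(\tau_m^{-1})\|_F^2\, \|\tilde{w}_\pi \pi(\alpha_m^{-1})\|_F^2 = d_\pi^2 \mu_D^2\, \|w_\pi\|_F^2\, \|\tilde{w}_\pi\|_F^2.$$

Since the coefficients of the matrix $c_\pi(W_m)$ are independent complex Gaussian random variables
with zero expectation and variance $d_\pi^{-1}$, one has that $\tilde{Z}^{(2)}_m$ (resp. $Z^{(2)}_m$) is a Gaussian random
variable with zero mean and variance $d_\pi \mu_D \|\tilde{w}_\pi \pi(\alpha_m^{-1})\|_F^2 = d_\pi \mu_D \|\tilde{w}_\pi\|_F^2$ (resp. $d_\pi \mu_D \|w_\pi \pi(\alpha_m^{-1})\|_F^2 =
d_\pi \mu_D \|w_\pi\|_F^2$). Hence, by (A.11), one obtains that

$$\mathbb{E}|Z^{(1)}_m|^2 \le \mu_D^2 d_\pi^4, \quad \mathbb{E}|\tilde{Z}^{(1)}_m|^2 \le \mu_D^2 d_\pi^4 \quad \text{and} \quad \mathbb{E}|Z^{(2)}_m|^2 = \mu_D d_\pi^2, \quad \mathbb{E}|\tilde{Z}^{(2)}_m|^2 = \mu_D d_\pi^2. \tag{A.12}$$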
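Therefore, (A.9) and Markov's inequality imply that

$$|Z^{(1)}_m|^2 = o_p(n^{-1}), \quad |Z^{(2)}_m|^3 = o_p(n^{-1}), \quad |Z^{(1)}_m Z^{(2)}_m| = o_p(n^{-1}), \tag{A.13}$$

and

$$|\tilde{Z}^{(1)}_m|^2 = o_p(n^{-1}), \quad |\tilde{Z}^{(2)}_m|^3 = o_p(n^{-1}), \quad |\tilde{Z}^{(1)}_m \tilde{Z}^{(2)}_m| = o_p(n^{-1}). \tag{A.14}$$

Hence, using (A.13), (A.14) and the second order Taylor expansion $\exp(z) = 1 + z + \frac{z^2}{2} + O(z^3)$,
it follows that

$$\log\left(\frac{Q^{(\pi,kl)}(X)}{Q(X)}\right) = \sum_{m=1}^{n} \log\left(1 + \int_G \left(\tilde{Z}^{(1)}_m + \tilde{Z}^{(2)}_m + \frac{1}{2}|\tilde{Z}^{(2)}_m|^2\right) h(\alpha_m)\,d\alpha_m + o_p(n^{-1})\right)$$
$$\qquad - \sum_{m=1}^{n} \log\left(1 + \int_G \left(Z^{(1)}_m + Z^{(2)}_m + \frac{1}{2}|Z^{(2)}_m|^2\right) h(\alpha_m)\,d\alpha_m + o_p(n^{-1})\right).$$

Then, using (A.14) and the second order expansion $\log(1 + z) = z - \frac{z^2}{2} + O(z^3)$ yield

$$\log\left(\frac{Q^{(\pi,kl)}(X)}{Q(X)}\right) = \sum_{m=1}^{n} \int_G \left(\tilde{Z}^{(1)}_m + \tilde{Z}^{(2)}_m + \frac{1}{2}|\tilde{Z}^{(2)}_m|^2\right) h(\alpha_m)\,d\alpha_m - \frac{1}{2}\sum_{m=1}^{n} \left(\int_G \left(\tilde{Z}^{(1)}_m + \tilde{Z}^{(2)}_m + \frac{1}{2}|\tilde{Z}^{(2)}_m|^2\right) h(\alpha_m)\,d\alpha_m\right)^2 \tag{A.15}$$
$$\qquad - \sum_{m=1}^{n} \int_G \left(Z^{(1)}_m + Z^{(2)}_m + \frac{1}{2}|Z^{(2)}_m|^2\right) h(\alpha_m)\,d\alpha_m + \frac{1}{2}\sum_{m=1}^{n} \left(\int_G \left(Z^{(1)}_m + Z^{(2)}_m + \frac{1}{2}|Z^{(2)}_m|^2\right) h(\alpha_m)\,d\alpha_m\right)^2 + o_p(1). \tag{A.16}$$

Let us now study the expansion of the quadratic term (A.16). Since $c_\pi(h) = \int_G \pi(\alpha_m^{-1})\,h(\alpha_m)\,d\alpha_m$,
it follows by the Cauchy-Schwarz inequality that

$$\sum_{m=1}^{n} \left(\int_G Z^{(1)}_m h(\alpha_m)\,d\alpha_m\right)^2 \le n\, d_\pi^2 \mu_D^2\, \|w_\pi\|_F^2\, \|w_\pi c_\pi(h)\|_F^2 \le C_2\, n\, d_\pi^4 \mu_D^2 D^{-\nu} = o(1)$$

for some constant $C_2 > 0$, where the last inequality is a consequence of Assumption 2.1, the fact
that $\lambda_\pi^{-\nu} \le D^{-\nu}$ for $\pi \in \hat{G}_D$ and the third relation in (A.9).

By Jensen's inequality and (A.9), and since the $Z^{(2)}_m$'s are i.i.d. Gaussian random variables
with zero mean and variance $\mu_D d_\pi^2$, one obtains that

$$\mathbb{E}\left[\sum_{m=1}^{n} \left(\int_G |Z^{(2)}_m|^2 h(\alpha_m)\,d\alpha_m\right)^2\right] \le \sum_{m=1}^{n} \int_G \mathbb{E}|Z^{(2)}_m|^4\, h(\alpha_m)\,d\alpha_m \le 3 n \mu_D^2 d_\pi^4 = o(1),$$

and thus Markov's inequality implies that $\sum_{m=1}^{n} \left(\int_G |Z^{(2)}_m|^2 h(\alpha_m)\,d\alpha_m\right)^2 = o_p(1)$. Now, using
(A.9) and (A.12), it follows that

$$\mathbb{E}\left|\sum_{m=1}^{n} \left(\int_G Z^{(1)}_m h(\alpha_m)\,d\alpha_m\right)\left(\int_G Z^{(2)}_m h(\alpha_m)\,d\alpha_m\right)\right| \le n \mu_D^{3/2} d_\pi^3 = o(1),$$

which implies that $\sum_{m=1}^{n} \left(\int_G Z^{(1)}_m h(\alpha_m)\,d\alpha_m\right)\left(\int_G Z^{(2)}_m h(\alpha_m)\,d\alpha_m\right) = o_p(1)$. Finally, using
(A.13), it follows that $\sum_{m=1}^{n} \left(\int_G Z^{(2)}_m h(\alpha_m)\,d\alpha_m\right)\left(\int_G |Z^{(2)}_m|^2 h(\alpha_m)\,d\alpha_m\right) = o_p(1)$. By applying
the same arguments to the expansion of the quadratic term (A.15), one finally obtains that

$$\log\left(\frac{Q^{(\pi,kl)}(X)}{Q(X)}\right) = \sum_{m=1}^{n} \int_G \left(\tilde{Z}^{(1)}_m - Z^{(1)}_m\right) h(\alpha_m)\,d\alpha_m + \sum_{m=1}^{n} \int_G \left(\tilde{Z}^{(2)}_m - Z^{(2)}_m\right) h(\alpha_m)\,d\alpha_m$$
$$\qquad + \frac{1}{2}\sum_{m=1}^{n} \int_G |\tilde{Z}^{(2)}_m|^2 h(\alpha_m)\,d\alpha_m - \frac{1}{2}\sum_{m=1}^{n} \int_G |Z^{(2)}_m|^2 h(\alpha_m)\,d\alpha_m$$
$$\qquad - \frac{1}{2}\sum_{m=1}^{n} \left(\int_G \tilde{Z}^{(2)}_m h(\alpha_m)\,d\alpha_m\right)^2 + \frac{1}{2}\sum_{m=1}^{n} \left(\int_G Z^{(2)}_m h(\alpha_m)\,d\alpha_m\right)^2 + o_p(1).$$

Using that $\|w_\pi\|_F^2 = \|\tilde{w}_\pi\|_F^2$ and the equality

$$\left\langle w_\pi c_\pi(h),\, (\tilde{w}_\pi - w_\pi)\,c_\pi(h) \right\rangle_F - \frac{1}{2}\|\tilde{w}_\pi c_\pi(h)\|_F^2 + \frac{1}{2}\|w_\pi c_\pi(h)\|_F^2 = -\frac{1}{2}\left\|(\tilde{w}_\pi - w_\pi)\,c_\pi(h)\right\|_F^2,$$

one obtains that

$$\log\left(\frac{Q^{(\pi,kl)}(X)}{Q(X)}\right) = \sum_{m=1}^{n} d_\pi \mu_D \left\langle w_\pi \left(\pi(\tau_m^{-1}) - c_\pi(h)\right),\, (\tilde{w}_\pi - w_\pi)\,c_\pi(h) \right\rangle_F \tag{A.17}$$
$$\qquad + \sum_{m=1}^{n} d_\pi \sqrt{\mu_D} \left\langle c_\pi(W_m),\, (\tilde{w}_\pi - w_\pi)\,c_\pi(h) \right\rangle_F \tag{A.18}$$
$$\qquad + \sum_{m=1}^{n} \frac{1}{2} \int_G |\tilde{Z}^{(2)}_m|^2 h(\alpha_m)\,d\alpha_m - \frac{n}{2}\, d_\pi \mu_D \|\tilde{w}_\pi\|_F^2 \tag{A.19}$$
$$\qquad - \sum_{m=1}^{n} \frac{1}{2} \int_G |Z^{(2)}_m|^2 h(\alpha_m)\,d\alpha_m + \frac{n}{2}\, d_\pi \mu_D \|w_\pi\|_F^2 \tag{A.20}$$
$$\qquad - \sum_{m=1}^{n} \frac{1}{2} \left(\int_G \tilde{Z}^{(2)}_m h(\alpha_m)\,d\alpha_m\right)^2 + \frac{n}{2}\, d_\pi \mu_D \|\tilde{w}_\pi c_\pi(h)\|_F^2 \tag{A.21}$$
$$\qquad + \sum_{m=1}^{n} \frac{1}{2} \left(\int_G Z^{(2)}_m h(\alpha_m)\,d\alpha_m\right)^2 - \frac{n}{2}\, d_\pi \mu_D \|w_\pi c_\pi(h)\|_F^2 \tag{A.22}$$
$$\qquad - \frac{n}{2}\, d_\pi \mu_D \left\|(\tilde{w}_\pi - w_\pi)\,c_\pi(h)\right\|_F^2 + o_p(1). \tag{A.23}$$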


Control of the term ()A.23p . Thanks to Assumption 12.11 and the fact that < D ^ for 
K^Gd and (|AJ]), it follows by ()AlO]l that 

ndT,lJLD\\{w-n - 'U^7r)c,r(/i)||F < "-^77^0 || C^r (/l) || op II "^tt " ^ttHf < 'inHoD^" = 0(1) , (A.24) 

and thus the term (jA.23p is bounded in probability. 

Control of the term ()A.17p . Remark that ()A.9P can be used to prove that 

Var ^ dTrlJ.D{wnTT{T~^), {w-^ - w.„)ct,{K)) F < nd^^|, || |||. || (zZ;,^ - w.^)c.„{h)\\p 



\in=l 



< n(i^^|)||u;^|||,||t(;^ - 'W7r||Fl|c,r(/i)||op 

< Andl^ilD^'^ = o{l) , 

and therefore by Chebyshev's inequality the term ()A.17p converges to zero in probability. 
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Control of the term (|Al8l) . First, since the coefficients of the matrix CT^iWm) are independent 
complex Gaussian random variables with zero expectation and variance d~^, one has that = 
dn^ffj^icniWrn), (wtt — w-n-)cTr{h)) p are i.i.d. Gaussian random variables with zero mean and 
variance dT^noWiw-K — ?^7r)c7r(/i)|||'. Using inequality (jA.24p it follows that 



Var T'm = nd^fiDWiwiT - w^)c^{h)\\p = 0(1) 



(A.25) 



\m=l 



and standard arguments in concentration of Gaussian variables imply that for any t > 



m=l 



>t] < 2 exp 



2ndj,^D\\{w^ - W7r)Cn{h)\\j, 



(A.26) 



Therefore, combining ()A.25p and (|A.26|) imply that the term (jA.lSp is bounded in probability. 

Control of the terms (|A.19p and ()A.20p . Remark that Jensen's inequality, the fact that the 
Zm's are i.i.d. Gaussian random variables with zero mean and variance fJ-od'^ and (\A.9\i imply 
that 



Varf V / \Z^i^\^h{am)dar, 



V Var f / |Z^)|2/i(arn)da 

m=l V^G 



< 



< n E\z[^^\^h{ai)dai <3nn%di = o{l) , 
Jg 



and thus the terms (A.19) and (A.20) converge to zero in probability by Chebyshev's inequality.

Control of the terms (A.21) and (A.22). Similarly, by Jensen's inequality and (A.9) one has that



h{ai) dai < Snfij^d^ = o(l) , 



and thus the terms (A.21) and (A.22) converge to zero in probability by Chebyshev's inequality.

Combining the above controls of the terms ()A.17p to ()A.23p . one obtains that log ( q(^x) 
is bounded in probability, which completes the proof of Proposition A.1. □

Now, recall that using (A.6) and (A.7)



snp R{f,f) > d^Yl E 



kl 



tgGd '=''=^ wen 



1/2 



> E '^-E E min 1, 



Qi^,ki)(^X) 
Q{X) 



-1/2 



Combining inequality (A.8) and Proposition A.1, one obtains that there exists a constant $C > 0$
(not depending on $n$) such that with the choice $D = n^{1/(2s + 2\nu + \dim G)}$ and for all sufficiently large $n$

snp R{f,f) > (I E E d-'C 

j-1/2 
«'7r,fci=a7r 

C — ^ 2 

— 1/2 

where we have used the fact that for any $\pi, k, l$ the cardinality of the set $\{w \in \Omega \text{ with } w_{\pi,kl} = d_\pi^{-1/2}\}$
is $|\Omega|/2$.

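Now, let $0 < \rho < 1$. Thanks to Proposition C.1 it follows that for $\eta = \rho W \frac{2^{\dim G/2} - 1}{2^{\dim G/2} + 1}$, one
has that

$$(W + \eta) D^{\dim G/2} \ge \sum_{\pi \in \hat G : \lambda_\pi < D} d_\pi^2 \ge (W - \eta) D^{\dim G/2}$$

for all sufficiently large $D$, where $W$ is the constant defined in (C.1). Hence,

$$\sum_{\pi \in \hat G_D} d_\pi^2 = \sum_{\pi : \lambda_\pi < 2D} d_\pi^2 - \sum_{\pi : \lambda_\pi < D} d_\pi^2 \ge (W - \eta)(2D)^{\dim G/2} - (W + \eta) D^{\dim G/2} = W' D^{\dim G/2}$$

with $W' = (1 - \rho) W (2^{\dim G/2} - 1) > 0$.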
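
The last equality is elementary algebra: with $a = 2^{\dim G/2}$ and $\eta = \rho W (a - 1)/(a + 1)$, one has $(W - \eta)a - (W + \eta) = W(a - 1) - \eta(a + 1) = (1 - \rho)W(a - 1)$. The short sketch below checks this identity numerically. The function name and the sample values of $W$, $\rho$ and $\dim G$ are hypothetical and chosen only for illustration.

```python
def lower_bound_constant(W, rho, dim_g):
    """Return (eta, lhs, rhs), where lhs and rhs are the two expressions of W'."""
    a = 2.0 ** (dim_g / 2.0)
    eta = rho * W * (a - 1.0) / (a + 1.0)
    lhs = (W - eta) * a - (W + eta)      # coefficient of D^{dim G / 2} in the bound
    rhs = (1.0 - rho) * W * (a - 1.0)    # closed form of W'
    return eta, lhs, rhs

# Illustrative (hypothetical) values of W, rho and dim G.
results = [
    lower_bound_constant(W, rho, dim_g)
    for W in (0.2, 1.0 / 3.0, 5.0)
    for rho in (0.1, 0.5, 0.9)
    for dim_g in (1, 3, 8)
]

for eta, lhs, rhs in results:
    print(f"eta={eta:.4f}  lhs={lhs:.6f}  rhs={rhs:.6f}")
```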



Taking D 



2s + 2i^ + dimG 



and since = ^ dimG/2 finally obtain that 



j^2.+2.+dimG sup R{f,f) > n2»+2''+dimGKi:)-«-dimG/2^dimG/2 ^ 

for some positive constant not depending on $n$, which completes the proof of Theorem 3.2.



B Some background on noncommutative harmonic analysis 

In this appendix, some aspects of the theory of the Fourier transform on compact Lie groups are
summarized. For detailed introductions to Lie groups and noncommutative harmonic analysis
we refer to the books [Bum04, DK00, Sep07]. Throughout the Appendix, it is assumed that $G$
is a connected and compact Lie group.



B.1 Representations

Definition B.1. Let $V$ be a finite-dimensional $\mathbb{C}$-vector space. A representation of $G$ in $V$ is
a continuous homomorphism $\pi : G \to GL(V)$, where $GL(V)$ is the set of automorphisms of $V$.
The representation $\pi$ is said to be irreducible if the only subspaces of $V$ that are invariant
under the automorphisms $\pi(g)$ for all $g \in G$ are $\{0\}$ and $V$.

If $G$ is a compact group and $\pi$ is an irreducible representation in $V$, then the vector space
$V$ is finite dimensional, and we denote by $d_\pi$ the dimension of $V$. By choosing a basis for $V$, it
is often convenient to identify $\pi(g)$ with a matrix of size $d_\pi \times d_\pi$ with complex entries.

Definition B.2. Two representations $\pi, \pi'$ in $V$ are called equivalent if there exists $M \in GL(V)$
such that $\pi(g) = M \pi'(g) M^{-1}$ for all $g \in G$.

Definition B.3. A representation $\pi$ is said to be unitary if $\pi(g)$ is a unitary operator for every
$g \in G$.

Let $\pi$ be a representation in $V$. Then, there exists an inner product on $V$ such that $\pi$ is
unitary. This means that any irreducible representation $\pi$ in $V$ is equivalent to an irreducible
representation that is unitary.

Definition B.4. We denote by $\hat G$ the set of equivalence classes of irreducible representations of
$G$, and we identify $\hat G$ with a set of unitary representatives, one for each class.

Proposition B.1. Let $g \in G$ and $\pi \in \hat G$, then $\pi(g^{-1}) = \overline{\pi(g)}^{t}$.

B.2 Peter-Weyl theorem

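Let $\pi \in \hat G$ be a representation in a Hilbert space $V$. Let $\mathcal{B}_\pi = (e_1, \ldots, e_{d_\pi})$ be a basis of $V$. For
$g \in G$, denote by $\phi^\pi_{ij}(g) = \langle e_i, \pi(g) e_j \rangle$ the coordinates of $\pi$ in the basis $\mathcal{B}_\pi$ for $i, j \in [1, d_\pi]$.

Theorem B.1. If $G$ is a compact group then $\left( \sqrt{d_\pi}\, \phi^\pi_{ij} \right)_{\pi \in \hat G,\ i,j \in [1, d_\pi]}$ is an orthonormal basis
of the Hilbert space $L^2(G)$ endowed with the inner product $\langle f, h \rangle = \int_G f(g) \overline{h(g)}\, dg$.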

B.3 Fourier transform and convolution in $L^2(G)$

Let $\pi \in \hat G$ and define for any $f \in L^2(G)$ the linear mapping

$$c_\pi(f) : V \to V, \qquad v \mapsto \int_G f(g)\, \overline{\pi(g)}^{t} v \, dg = \int_G f(g)\, \pi(g^{-1}) v \, dg.$$

Note that the matrix $c_\pi(f)$ is the generalization to functions in $L^2(G)$ of the usual notion of
Fourier coefficients.

Definition B.5. Let $f \in L^2(G)$ and $\pi \in \hat G$. We call $c_\pi(f)$ the $\pi$-th Fourier coefficient of $f$.

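Theorem B.2. Let $f \in L^2(G)$. Then $f(g) = \sum_{\pi \in \hat G} d_\pi \mathrm{Tr}\left( \pi(g) c_\pi(f) \right)$, and
$\|f\|^2_{L^2(G)} = \sum_{\pi \in \hat G} d_\pi \mathrm{Tr}\left( c_\pi(f) \overline{c_\pi(f)}^{t} \right) = \sum_{\pi \in \hat G} d_\pi \|c_\pi(f)\|_F^2$, where $\|\cdot\|_F$ denotes the Frobenius norm of
a matrix.

Definition B.6. Let $f, h \in L^2(G)$. The convolution of $f$ and $h$ is defined as the function
$(f * h)(g) = \int_G f(g'^{-1} g) h(g') \, dg'$ for $g \in G$.

Proposition B.2. Let $f, h \in L^2(G)$, then $c_\pi(f * h) = c_\pi(f) c_\pi(h)$.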
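
The identities of Theorem B.2 and Proposition B.2 can be checked numerically on a small noncommutative example. The sketch below uses the finite group $S_3$ (permutations of three points), which is compact but not connected, so it falls outside the standing assumption of this appendix and serves only as an illustration. Integrals against the Haar measure become averages over the six group elements, and the three irreducible unitary representations (trivial, sign, and two-dimensional standard) are built by hand. All function and variable names are hypothetical.

```python
import itertools
import numpy as np

# Elements of S3 as tuples p, where p[i] is the image of i.
ELEMENTS = list(itertools.permutations(range(3)))
N = len(ELEMENTS)

def compose(p, q):
    """Return the permutation p o q, i.e. i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0, 0, 0]
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def perm_matrix(p):
    """Matrix sending e_i to e_{p[i]}, so that perm_matrix is a homomorphism."""
    m = np.zeros((3, 3))
    for i in range(3):
        m[p[i], i] = 1.0
    return m

# Orthonormal basis of the plane orthogonal to (1, 1, 1), which carries the
# two-dimensional irreducible representation.
PLANE = np.column_stack([
    np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0),
    np.array([1.0, 1.0, -2.0]) / np.sqrt(6.0),
])

IRREPS = {
    "trivial": lambda p: np.eye(1),
    "sign": lambda p: np.array([[np.rint(np.linalg.det(perm_matrix(p)))]]),
    "standard": lambda p: PLANE.T @ perm_matrix(p) @ PLANE,
}

def fourier_coefficient(f, rep):
    """c_pi(f) = (1/|G|) sum_g f(g) pi(g^{-1})."""
    return sum(f[g] * rep(inverse(g)) for g in ELEMENTS) / N

def convolve(f, h):
    """(f * h)(g) = (1/|G|) sum_{g'} f(g'^{-1} g) h(g')."""
    return {
        g: sum(f[compose(inverse(gp), g)] * h[gp] for gp in ELEMENTS) / N
        for g in ELEMENTS
    }

rng = np.random.default_rng(0)
f = {g: rng.normal() for g in ELEMENTS}
h = {g: rng.normal() for g in ELEMENTS}
fh = convolve(f, h)

coeffs_f = {name: fourier_coefficient(f, rep) for name, rep in IRREPS.items()}
coeffs_h = {name: fourier_coefficient(h, rep) for name, rep in IRREPS.items()}
coeffs_fh = {name: fourier_coefficient(fh, rep) for name, rep in IRREPS.items()}

# Proposition B.2: c_pi(f * h) = c_pi(f) c_pi(h), with this order of the factors.
convolution_ok = all(
    np.allclose(coeffs_fh[name], coeffs_f[name] @ coeffs_h[name]) for name in IRREPS
)

# Theorem B.2: Plancherel identity and Fourier inversion.
norm_sq = sum(v * v for v in f.values()) / N
plancherel = sum(c.shape[0] * np.linalg.norm(c, "fro") ** 2 for c in coeffs_f.values())
inversion_ok = all(
    np.isclose(
        f[g],
        sum(c.shape[0] * np.trace(IRREPS[name](g) @ c) for name, c in coeffs_f.items()),
    )
    for g in ELEMENTS
)

print(convolution_ok, inversion_ok, norm_sq, plancherel)
```

Because the representations here are real orthogonal, the conjugate transpose in Proposition B.1 reduces to a plain transpose.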

C Laplace-Beltrami operator on a compact Lie group 

For further details on the material presented in this section we refer to the technical appendix
in [KK08] and to the book [Far08]. In this section, we still assume that $G$ is a connected
and compact Lie group. In what follows, with no loss of generality, we identify (through an
isomorphism) $G$ with a subgroup of $GL_{r \times r}(\mathbb{C})$ (the set of $r \times r$ nonsingular matrices with complex
entries) for some integer $r > 0$.

C.1 Lie algebra

Definition C.1. A one parameter subgroup of $G$ is a group homomorphism $c : \mathbb{R} \to G$.

Theorem C.1. Let $c : \mathbb{R} \to GL_{r \times r}(\mathbb{C})$ be a one parameter subgroup of $GL_{r \times r}(\mathbb{C})$. Then $c$ is $C^\infty$
and $c(t) = \exp(tA)$, with $A = \frac{dc}{dt}(0)$.

Definition C.2. Let $\mathcal{M}_{r \times r}(\mathbb{C})$ be the set of $r \times r$ matrices with complex entries. The mapping
$[\cdot, \cdot] : \mathcal{M}_{r \times r}(\mathbb{C}) \times \mathcal{M}_{r \times r}(\mathbb{C}) \to \mathcal{M}_{r \times r}(\mathbb{C}) : X, Y \mapsto [X, Y] = XY - YX$ is called a Lie bracket. The Lie algebra of $G$
is the $\mathbb{C}$-vector space $\mathfrak{g} = \{X \in \mathcal{M}_{r \times r}(\mathbb{C}) : \exp(tX) \in G \ \forall t \in \mathbb{R}\}$ endowed with the bilinear form
$[\cdot, \cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g} : X, Y \mapsto [X, Y]$, which satisfies $[X, Y] = -[Y, X]$ and $[[X, Y], Z] + [[Y, Z], X] +
[[Z, X], Y] = 0$ (Jacobi identity).

Definition C.3. The Killing form is the bilinear form $B$ defined by

$$B : \mathfrak{g} \times \mathfrak{g} \to \mathbb{C} : X, Y \mapsto \mathrm{Tr}\left[ \mathrm{ad}(X)\, \mathrm{ad}(Y) \right],$$

where $\mathrm{ad}(X) : \mathfrak{g} \to \mathfrak{g} : Y \mapsto [X, Y]$ is an endomorphism of $\mathfrak{g}$.
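
To make Definitions C.2 and C.3 concrete, the following sketch computes the Lie bracket, the adjoint maps and the Killing form for the Lie algebra of $SO(3)$, realized as $3 \times 3$ real skew-symmetric matrices. This particular group, the basis `L1, L2, L3` and all helper names are illustrative assumptions that do not come from the paper. For this basis the Killing form is expected to equal $-2$ times the identity matrix.

```python
import numpy as np

# Standard basis of so(3): skew-symmetric generators of rotations about each axis.
L1 = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
L2 = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
L3 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
BASIS = [L1, L2, L3]

def bracket(x, y):
    """Lie bracket [X, Y] = XY - YX."""
    return x @ y - y @ x

def coordinates(m):
    """Coordinates of m in BASIS, using Tr(L_j L_k) = -2 * delta_jk."""
    return np.array([-0.5 * np.trace(m @ b) for b in BASIS])

def ad(x):
    """Matrix of Y -> [X, Y] in BASIS (column j holds the coordinates of [X, L_j])."""
    return np.column_stack([coordinates(bracket(x, b)) for b in BASIS])

def jacobi_residual(x, y, z):
    """[[X,Y],Z] + [[Y,Z],X] + [[Z,X],Y], which should vanish."""
    return (bracket(bracket(x, y), z)
            + bracket(bracket(y, z), x)
            + bracket(bracket(z, x), y))

# Killing form B(L_i, L_j) = Tr(ad(L_i) ad(L_j)).
killing = np.array([[np.trace(ad(x) @ ad(y)) for y in BASIS] for x in BASIS])
print(killing)

# Jacobi identity on random elements of so(3).
rng = np.random.default_rng(1)
x, y, z = (sum(c * b for c, b in zip(rng.normal(size=3), BASIS)) for _ in range(3))
print(np.abs(jacobi_residual(x, y, z)).max())
```

The negative definiteness observed here is typical of compact semisimple groups, which is why a norm can be induced from the Killing form after a sign change.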

C.2 Roots of a Lie algebra 

A torus in $G$ is a connected Abelian subgroup of $G$. It is well known that in a compact Lie
group $G$, there exists (up to an isomorphism) a maximal torus. Let us fix such a maximal torus
that we denote by $T$. Denote by $\mathfrak{t}$ the Lie algebra of $T$, which is a maximal Abelian subalgebra
of $\mathfrak{g}$. Let $\mathfrak{h} = \mathfrak{t} + i\mathfrak{t}$ be the complexification of $\mathfrak{t}$. Then, $\mathfrak{h}$ is a maximal Abelian subalgebra of $\mathfrak{g}$
such that the linear transformations $(\mathrm{ad}(H))_{H \in \mathfrak{t}}$ are simultaneously diagonalizable. Denote by
$\mathfrak{h}^*$ the dual space of $\mathfrak{h}$. Let $\alpha \in \mathfrak{h}^*$, and define

$$\mathfrak{g}^\alpha = \{X \in \mathfrak{g} : \forall H \in \mathfrak{h}, \ [H, X] = \alpha(H) X\}.$$

Definition C.4. $\alpha \in \mathfrak{h}^*$ is said to be a root of $\mathfrak{g}$ with respect to $\mathfrak{h}$, if $\mathfrak{g}^\alpha$ is nonzero, and in this
case $\mathfrak{g}^\alpha$ is called the corresponding root space. We also denote by $\Phi \subset \mathfrak{h}^*$ the set of roots.

Each root space is of dimension 1. One has that $\mathfrak{g}^0 = \mathfrak{h}$ (by the maximal property of $\mathfrak{h}$)
and $\mathfrak{g}$ can be decomposed as the following direct sum $\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha \in \Phi} \mathfrak{g}^\alpha$, called the root space
decomposition of $\mathfrak{g}$. To each $\alpha \in \Phi$ we associate the hyperplane $H_\alpha \subset \mathfrak{h}^*$ that is orthogonal to
$\alpha$. The set of all hyperplanes $\{H_\alpha : \alpha \in \Phi\}$ partitions $\mathfrak{h}^*$ into a finite number of open convex
regions called the Weyl chambers of $\mathfrak{h}^*$. In what follows, we choose and fix a fundamental Weyl
chamber denoted by $K$.

Definition C.5. Let $\Phi$ be the set of real roots and $\Phi^+ = \{\alpha \in \Phi : \forall \beta \in K, \ \langle \alpha, \beta \rangle > 0\}$ be the
set of positive roots. Denote one-half of the sum of positive roots by $\rho = \frac{1}{2} \sum_{\alpha \in \Phi^+} \alpha$.

C.3 Laplace-Beltrami operator 

The Laplace-Beltrami operator is a generalization to Riemannian manifolds (such as Lie groups)
of the usual Laplacian operator. We will denote this operator by $\Delta$. To state the following
proposition, note that one may identify the set $\hat G$ with a subset of $\mathfrak{h}^*$ (see the technical appendix
in [KK08] for further details on this identification).



Proposition C.l. The elements of G are the eigenfunctions of A. Let vr S G. The eigenvalue 

2||||2 — 

of TT is = ||vr + p\\ — \\p\\ , where || • || is the norm induced by the Killing form. For tt £ G 
one has the following relationship between dj^ and A,, 



^ dl = T^r(^^-«)/2 + o(t('1'-«)/2) as T ^ 00, 

7reG:A^<T 

where 

W = ^.-^ , (C.l) 

(20F)"'"^^r(l + idimG) 

with volG denoting the volume of G, the bold symbol tt denoting the number Pi and r(.) being 
the classical gamma function. 
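
As an illustration of the asymptotic relation in Proposition C.1, the sketch below takes $G = SU(2)$, whose irreducible representations are indexed by $n = 0, 1, 2, \ldots$ with $d_\pi = n + 1$. It assumes that the metric is normalized so that $SU(2)$ is isometric to the unit 3-sphere, in which case $\lambda_\pi = n(n+2)$ and $\mathrm{vol}\, G = 2\boldsymbol{\pi}^2$. This normalization may differ from the Killing-form normalization by a constant factor, which rescales $\lambda_\pi$ and $\mathrm{vol}\, G$ consistently. With these assumptions (C.1) gives $W = 1/3$, and the ratio of the partial sums to $W T^{3/2}$ should approach 1.

```python
import math

DIM_G = 3                      # dimension of SU(2)
VOL_G = 2.0 * math.pi ** 2     # volume of the unit 3-sphere (assumed normalization)

def weyl_constant(vol, dim):
    """Constant W of (C.1): vol / ((2 sqrt(pi))^dim * Gamma(1 + dim / 2))."""
    return vol / ((2.0 * math.sqrt(math.pi)) ** dim * math.gamma(1.0 + dim / 2.0))

def sum_squared_dimensions(T):
    """Sum of d_pi^2 = (n + 1)^2 over the irreps with eigenvalue n (n + 2) < T."""
    total = 0
    n = 0
    while n * (n + 2) < T:
        total += (n + 1) ** 2
        n += 1
    return total

W = weyl_constant(VOL_G, DIM_G)
thresholds = (1e2, 1e4, 1e6)
ratios = [sum_squared_dimensions(T) / (W * T ** (DIM_G / 2.0)) for T in thresholds]

for T, r in zip(thresholds, ratios):
    print(f"T={T:.0e}  ratio={r:.5f}")
```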